A Learning‐Rate Modulable and Reliable TiO x Memristor Array for Robust, Fast, and Accurate Neuromorphic Computing

Abstract Realizing memristor-based neuromorphic hardware is important for achieving energy-efficient big-data processing and artificial intelligence at the integrated device-system level. To this end, uniform and reliable titanium oxide (TiO x ) memristor arrays are fabricated as the constituent device elements of a hardware neural network; their passive matrix array structure enables vector-matrix multiplication between multiple input signals and trained synaptic weights. In particular, an in situ convolutional neural network hardware system is designed and implemented using multiple 25 × 25 TiO x memristor arrays, and the memristor device parameters are developed to allow a global constant-voltage programming scheme for all cells in the crossbar array without any voltage-tuning peripheral circuit such as a transistor. Moreover, learning rate modulation during in situ hardware training is successfully achieved owing to the superior performance of the TiO x memristor, including threshold uniformity (≈2.7%), device yield (>99%), repetitive stability (≈3000 spikes), a low asymmetry value of ≈1.43, ambient stability (6 months), and a nonlinear pulse response. The learning-rate-modulable, fast-converging in situ training based on direct memristor operation requires five times fewer training iterations and reduces training energy compared with conventional hardware in situ training, at a classification accuracy of ≈95.2%.

increased, the PSC change increased at the same pulse number. Likewise, as the magnitude of V d increased, the PSC decreased, but no significant variation was detected above 1.4 V. Note that the relative change in potentiation voltage magnitude (Figure S5-1a) needed to obtain a given PSC change is remarkably smaller than the corresponding change in pulse width (Figure S5-1c), indicating that modulating the programming pulse amplitude is an energy-efficient way to adjust the learning rate. When the pulse width was decreased with the other parameters fixed (V p = -1.55 V, V d = 1.4 V, and ∆t = 800 ms), the statistical uniformity tended to degrade: the error value was ~7.68 % for 100 ms (black line), ~7.63 % for 50 ms (red line), and 10.0 % for 30 ms (blue line). However, by modulating V p and ∆t, it is also possible to reduce the pulse width while largely maintaining (or even reducing) the PSC error value. For a pulse width of 10 ms, reliable LTP/LTD curves were obtained at a V p of -2.7 V, and the error was ~12.8 % for a pulse period of 800 ms (magenta line). When the pulse period was reduced to 100 ms, the error value was also reduced to ~6.15 % (green line), which is related to the frequency-dependent synaptic plasticity shown in Figure 1h. Consequently, although there is a tradeoff between switching speed and uniformity, the loss of statistical uniformity can be considerably mitigated by modulating the pulse programming scheme (magnitude or period) on the basis of frequency-dependent synaptic plasticity for practical application.

Figure S6. Dependence of the voltage applying scheme on the SET, RST, and READ processes. a) SET process. b) RST process. c) READ process. Initially, all rows and columns are connected to V CM . When the device on the first row and column is under the SET (RST) process, the first row is connected to V SET (V RST ), and all columns except the first one are connected to V SET,H (V RST,H ).
After t SET (t RST ) seconds, all columns and rows are connected to V CM again. To read the conductance of the target device highlighted in red, the first row is connected to V RN or V RP depending on the polarity of the array, while the columns are connected to the input nodes of the TIAs, which act as a virtual ground. Each reference voltage is actually implemented relative to a V CM of 2.5 V, which is described as 0 V in the main text for convenience.

R F is determined by considering the current from the memristors and the output swing range of the TIA ([0 V, 5 V]) during the programming process for 1 device and the inference process for 50 devices. During the programming process for 1 device, R F is 250 kΩ because the maximum conductance in the LRS is approximately 20 μS per device. During the inference phase, in which the current is generated from multiple devices, R F is set to 10 kΩ, considering the worst-case pattern.

Schematic diagrams of the measurement setup. In the measurement of the positive array, only V RP is applied to the positive array, while V CM is applied to the negative arrays. In the case of the negative array, only V RN is applied. The input pattern increases cumulatively from 1 to 25, in which the numbers represent the number of rows connected to V RP or V RN . The conductance state of all memristors is set to the HRS, and the positive and negative arrays are measured separately. c,d) Measurement results of the positive and negative arrays. The x-axis represents the number of rows connected to V RP or V RN , and the y-axis represents the summed current.
The measurement condition is that all memristors are in the HRS, in which the conductance is distributed around approximately 2 μS. The results show linear and uniform VMM performance as the input pattern gradually increases.

conductance error for each cell and training effectiveness of pulse number modulation. The histogram indicates that FCIS training, which uses a high rate of conductance change at the initial iterations, has high training efficiency with fewer pulses than required in normal in-situ training. The number of writing pulses is converted from the amount of weight update, ΔW, determined by equation (5) in the main text, through the LUT (Table S1).

ms and 3 ms of pulse width, respectively. d) The total number of required training pulses at the initial, 500th, 2,000th, 4,000th, and 8,000th iterations. In our study, the switching constraint of the memristor array device was chosen to give the most appropriate analog switching properties (for example, dynamic range, asymmetry, and repetitive endurance) for high-performance hardware in-situ training.

Given the same number of training pulses and iterations required for training, the total energy consumption for training is expected to be proportional to the pulse width of the training pulse.
However, in terms of power efficiency, since only the numbers of iterations and training pulses affect the performance at a fixed pulse amplitude, the power efficiency with a 500 ms pulse width is 12 times higher than that with a 3 ms pulse width.

for the memristor device fabricated by the reactive sputtering method, the peak shift was more remarkable, and the device entirely lost its resistive switching properties, as shown in Figure S1c (magenta line). Here, switching degradation was also observed in the UVO-treated TiO x memristor device, and the degradation became more severe when the duration of the UVO treatment was increased from 1 to 10 min. This result indicates that oxygen treatment beyond the lattice stoichiometry diminishes the switching performance because the V o concentration becomes considerably deficient. Figure S1d shows the XPS depth-profiling analysis of the memristor device after UVO treatment for 10 min, presenting an x value of ~2.17 in the TiO x switching layer, which is clearly distinguishable from the optimized value of ~2.03 (Figure 1c). As a result, through array-level fabrication and electrical investigation of the TiO x memristor device, we verified the theoretical basis that the x value should approach the lattice level of 2 for reliable resistive switching. [1,5,7] This stoichiometric stabilization of the TiO x memristive layer, which prevents an excessive or deficient V o concentration, provides the basis for a large-scale passive memristor array to serve as the constituent device element of a hardware neural network system.
In addition, the V o -based switching mechanism was also investigated by transmission electron microscopy (TEM) analysis of each resistive state of the device, as shown in Figure S1e. Here, the specific element distribution can be observed with intensity

which denotes the probability density function of y = x^n (Figure S2e). With this, the probability (P) for a specific range of x can be calculated, which depends on the value of n.
Then, the probability that the yield exceeds a specific value p can be calculated using conventional Bayes' theorem. [9,10] From n = 2 (black line) to n = 120 (green line), the curvature of x^n increases dramatically, which increases the probability of a high yield near ~100 % (colored box). When p is fixed, P (p ≤ x ≤ 1) can be plotted as a function of n (Figure S2f), showing that the reliability of the device operational yield is enhanced as the number of evaluated devices increases. For example, the probability of a yield ≥ 90 % reaches ~100 % when n ≈ 60 (black line).
Similarly, the probabilities of a yield ≥ 95, 98, and 99 % are close to ~100 % when n ≈ 120, 240, and 360, respectively. This indicates the statistical reliability of the TiO x memristive device at the array level when a sufficient number of devices showing the switching property have been evaluated. In other words, the number of evaluated devices affects the reliability of the device operational yield, and in our measurement procedure, a device yield of more than 99 % can be assured with high probability.
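The dependence of this probability on the number of evaluated devices can be illustrated numerically. The following is a minimal sketch, assuming (this is not stated explicitly in the text) that the density x^n is normalized over 0 ≤ x ≤ 1, which corresponds to a uniform prior on the yield x with all n evaluated devices switching. Under that assumption, P (p ≤ x ≤ 1) = 1 − p^(n+1). The function name is hypothetical.

```python
def prob_yield_at_least(p: float, n: int) -> float:
    """Return P(p <= x <= 1) for a density proportional to x**n on [0, 1].

    Assumption (for illustration, not taken from the source): the density
    is normalized over [0, 1], so the integral of x**n from p to 1 divided
    by the integral from 0 to 1 reduces to 1 - p**(n + 1).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - p ** (n + 1)


if __name__ == "__main__":
    # Yield thresholds and device counts quoted in the text.
    for p, n in [(0.90, 60), (0.95, 120), (0.98, 240), (0.99, 360)]:
        print(f"yield >= {p:.0%}, n = {n}: P = {prob_yield_at_least(p, n):.4f}")
```

Under this assumption, each of the four (p, n) pairs quoted above gives a probability above 0.97, which is in line with the statement that the probabilities are close to ~100 %.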

Supporting Note 3. Analog switching properties of the TiO x memristor device
Figure S3a shows the PSC response of the TiO x memristor device when the input pulse (V p = 1.55 V for t w = 150 ms) was sequentially accumulated over a period of Δt = 800 ms on the presynaptic neuron (top Al electrode). When we measured the PSC of the device at 0.5 V before and after a single V p pulse train, the PSC gradually increased, indicating that the conductance through the V o -based switching filament can be gradually increased by the potentiation pulse train. Additionally, the intermediate conductance states were well maintained before the subsequent V p pulse was applied (Figure S3a and Figure 1i), resulting in a gradual uplift during the LTP process. Therefore, the PSC modulation behavior in this study successfully emulated an important function of biological synaptic plasticity. [11][12][13] Figure S3b shows the LTP/LTD characteristics of the TiO x memristor device for different oxygen treatments (pristine, and UVO for 1 and 10 min), where 25 steps each of the potentiation pulse (V p = 1.55 V) and the depression pulse (V d = 1.4 V) were applied for t w = 150 ms at Δt = 800 ms. As discussed in Supporting Note 1, a reduction in the dynamic range was similarly observed after excessive UVO treatment of the TiO x layer.

Supporting Note 4. Nonlinear pulse response of the TiO x memristor device
As shown by the V p variation of the LTP/LTD curves (Figure S5-1a and Figure 3e), a small change in pulse voltage can dramatically change the PSC value; this is the nonlinear pulse response of the TiO x memristor device. This nonlinear current transport, which depends on the magnitude of the V p pulse, is shown in detail in the inset of Figure 1g, where ∆G avg in the potentiation process decreased nonlinearly as V p decreased. Indeed, the nonlinear pulse response is known to be a general feature of redox-based TiO x memory driven by V o filament transport. [14] As the V o filament is formed through the TiO x layer by the potentiation pulse, the charge carriers can migrate along neighboring V o traps rather than via metallic transport. [3,5] Therefore, the major transport mechanism could be associated with trap-assisted space-charge-limited conduction in the TiO x layer, expressed as I ∝ V^n (n ≈ 2), rather than the ohmic behavior of conventional ECM transport. [8,15] Accordingly, the current has a nonlinear voltage dependence, which in turn results in a significant reduction in the PSC at low potentiation voltage. Along with the reduction of unselected-line transport in the passive matrix array structure, the nonlinear characteristics are important for strongly modulating the conductance change with a small variation in the potentiation voltage, which can be utilized to adjust the learning rate in a hardware-integrated neural network system (Figures 3e and 3f). Note that the I-V nonlinearity does not affect the read process, because a binary input, represented by either 0 or V R , is used in this work (see Methods).
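To illustrate why a power-law dependence gives a stronger voltage sensitivity than ohmic conduction, the sketch below compares the relative current for I ∝ V^n with n = 2 against n = 1. It only illustrates the scaling relation quoted above: the function name and the example voltages are hypothetical, and it does not model the pulse-driven conductance change ∆G avg itself.

```python
def relative_current(v: float, v_ref: float, n: float = 2.0) -> float:
    """Current at |v| relative to the current at |v_ref|, assuming I ∝ V**n."""
    if v_ref == 0:
        raise ValueError("v_ref must be non-zero")
    return (abs(v) / abs(v_ref)) ** n


if __name__ == "__main__":
    v_ref = 1.55  # reference pulse magnitude in volts (illustrative)
    for v in (1.55, 1.45, 1.35):  # hypothetical lower magnitudes
        sclc = relative_current(v, v_ref, n=2.0)   # power law with n = 2
        ohmic = relative_current(v, v_ref, n=1.0)  # ohmic reference, n = 1
        print(f"|V| = {v:.2f} V: n=2 -> {sclc:.3f}, n=1 -> {ohmic:.3f}")
```

For any magnitude below the reference, the n = 2 ratio is smaller than the n = 1 ratio, so the same voltage step produces a larger relative change in current.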
Supporting Note 5. Detailed information on the hardware implementation and system automation
The analog switching matrix on the PCB is implemented with MAX14661 16:2 and MAX14763 2:1 multiplexers. During the write and read programming, [16] the former selects the column and row of the target cell, while the latter selects the reference voltages to be

Supporting Note 6. Clothes data set
To evaluate the training performance of the TiO x memristor array device integrated in the hardware system, we tested the Clothes data set [17] via the same quantization process as used for the MNIST data set. As shown in Figures S12a and S12b, the analog magnitude (256-level) in each 28 × 28 pixel image was scaled down to 8-level resolution, and then the images were

In addition to the energy efficiency of in-situ training, its tolerance to device non-idealities can also be quantified through simulation. Figure S22 shows the performance improvement of in-situ training under device non-idealities: conductance variation with programming error, stuck-ON, and stuck-OFF. The programming-error simulation assumes that a programming failure occurs as a percentage error in the target conductance change during the learning phase. The result in Figure S22a shows that in-situ training is more tolerant to programming error than transfer learning, depending on the programmed conductance state.

To clarify the retention characteristics, the PSC ratio is introduced, which is the conductance after retention divided by the initial conductance. Figure S23c shows the simulation result for the case in which all devices of the CNN have uniform retention properties regardless of the conductance value. Since uniform conductance retention only affects the overall conductance range while the ratio of conductances among the devices is maintained, a classification accuracy of over 95 % is recorded despite a PSC ratio of 60 %.
In fact, because what matters in the VMM between the input image data (read voltage) and the encoded synaptic weights (memristor conductance) is the 'relative' proportion and ordering of the output activation function values, the absolute magnitude is not the critical factor for the artificial neural network system. In this simulation, the PSC ratio in Figure S23a is applied as the average value over all conductance states, which is ~56.2 %, and a classification accuracy of 95.5 % is recorded. In the non-uniform retention simulation, the PSC ratio is applied to each conductance differently according to the conductance range, as a normalized ratio, and the degree of non-uniformity is evaluated by the minimum value of the normalized ratio. The simulation result based on the measured retention characteristics of the device is close to the case in which the minimum normalized ratio is 0.8, giving a classification accuracy of 95.2 %. In addition, when the PSC ratio is reduced as the retention worsens after a few days, the degraded conductance states can be retrained. As shown in Figure S23e, we actually performed the hardware neuromorphic training process with a 'long time interval' after suspending the training. When the iteration number reached 3,000, we stopped the entire training process and left the system as it was for 3 days (~72 hours). After that, we restarted the training and observed that the loss function continued to decrease regardless of the 3 days of memristor device retention time. The loss function value at the 4,000th iteration was measured as 0.07, which corresponds to an accuracy of ~95.2 % based on the results in Figures 5a and b (main text). Therefore, we concluded that the synaptic weight map, that is, the memristor conductance configuration, is maintained well enough over 3 days of retention time to reproduce the original accuracy, indicating the tolerance of in-situ training to the retention property.
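The argument that a uniform retention loss preserves the classification result can be checked with a small numerical sketch. Assuming an idealized single-layer crossbar with no read noise, scaling every conductance by the same PSC ratio scales every column current by that ratio, so the index of the largest output is unchanged. The function names, conductance values, read voltage, and input pattern below are hypothetical; only the 0.562 ratio is taken from the text.

```python
from typing import List


def vmm(inputs: List[float], conductances: List[List[float]]) -> List[float]:
    """Column currents I_j = sum_i V_i * G_ij of an ideal crossbar."""
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(inputs, conductances))
        for j in range(n_cols)
    ]


def apply_uniform_retention(
    conductances: List[List[float]], psc_ratio: float
) -> List[List[float]]:
    """Scale every conductance by the same PSC ratio (uniform retention)."""
    return [[g * psc_ratio for g in row] for row in conductances]


def predicted_class(outputs: List[float]) -> int:
    """Index of the largest output current."""
    return max(range(len(outputs)), key=outputs.__getitem__)


if __name__ == "__main__":
    # Hypothetical 4 x 3 conductance map in microsiemens (rows = inputs).
    g = [[2.0, 8.0, 4.0], [6.0, 3.0, 5.0], [1.0, 9.0, 2.0], [7.0, 2.0, 3.0]]
    # Binary read inputs: 0 V or an illustrative read voltage of 0.5 V.
    v = [0.5, 0.0, 0.5, 0.5]
    before = vmm(v, g)
    after = vmm(v, apply_uniform_retention(g, 0.562))
    print(predicted_class(before), predicted_class(after))
```

This sketch covers only the uniform case. With non-uniform retention, each conductance is scaled by a different factor, so the ordering of the outputs is no longer guaranteed, which is why the text evaluates that case separately.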

Supporting Note 10. Feasibility of backpropagation using convolution module
As shown in Figure S26

The image "2" is inferred after every weight update of one iteration is completed. Finally, the lower image shows the change in the loss function with the training iteration.